Overview
Dataset statistics
| Number of variables | 33 |
|---|---|
| Number of observations | 1100000 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 841.6 MiB |
| Average record size in memory | 802.3 B |
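The average record size can be cross-checked against the two figures above it. A minimal sketch of that arithmetic, assuming MiB means 2^20 bytes and that the average is simply total memory divided by the number of observations (the variable names are illustrative and not part of the report):

```python
# Figures copied from the dataset statistics table.
total_mib = 841.6
n_rows = 1_100_000

# Assumption: 1 MiB = 1024 * 1024 bytes.
total_bytes = total_mib * 1024 * 1024
avg_record_bytes = total_bytes / n_rows

print(f"{avg_record_bytes:.1f} B")  # 802.3 B
```

The result agrees with the reported 802.3 B to the precision shown.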
Variable types
| DateTime | 1 |
|---|---|
| Categorical | 13 |
| Numeric | 16 |
| Text | 3 |
category is highly overall correlated with subcategory | High correlation |
channel is highly overall correlated with city and 4 other fields | High correlation |
city is highly overall correlated with channel and 4 other fields | High correlation |
country is highly overall correlated with channel and 4 other fields | High correlation |
discount_pct is highly overall correlated with promo_flag | High correlation |
gross_sales is highly overall correlated with list_price and 3 other fields | High correlation |
is_weekend is highly overall correlated with weekday | High correlation |
latitude is highly overall correlated with channel and 4 other fields | High correlation |
list_price is highly overall correlated with gross_sales and 2 other fields | High correlation |
longitude is highly overall correlated with channel and 4 other fields | High correlation |
margin_pct is highly overall correlated with promo_flag | High correlation |
month is highly overall correlated with weekofyear | High correlation |
net_sales is highly overall correlated with gross_sales and 3 other fields | High correlation |
promo_flag is highly overall correlated with discount_pct and 1 other field | High correlation |
purchase_cost is highly overall correlated with gross_sales and 2 other fields | High correlation |
store_id is highly overall correlated with channel and 4 other fields | High correlation |
subcategory is highly overall correlated with category | High correlation |
units_sold is highly overall correlated with gross_sales and 1 other field | High correlation |
weekday is highly overall correlated with is_weekend | High correlation |
weekofyear is highly overall correlated with month | High correlation |
is_holiday is highly imbalanced (89.6%) | Imbalance |
discount_pct is highly imbalanced (75.7%) | Imbalance |
promo_flag is highly imbalanced (59.7%) | Imbalance |
stock_out_flag is highly imbalanced (80.5%) | Imbalance |
weekday has 156713 (14.2%) zeros | Zeros |
Reproduction
| Analysis started | 2025-12-28 15:29:43.077018 |
|---|---|
| Analysis finished | 2025-12-28 15:33:04.375866 |
| Duration | 3 minutes and 21.3 seconds |
| Software version | ydata-profiling v4.18.0 |
| Download configuration | config.json |
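The reported duration can be reproduced from the two timestamps. A small sketch using only the standard library; the format string and variable names are chosen here for illustration:

```python
from datetime import datetime

# Timestamp layout used in the Reproduction table.
FMT = "%Y-%m-%d %H:%M:%S.%f"
started = datetime.strptime("2025-12-28 15:29:43.077018", FMT)
finished = datetime.strptime("2025-12-28 15:33:04.375866", FMT)

# Split the elapsed time into whole minutes and remaining seconds.
elapsed = finished - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)

print(f"{int(minutes)} minutes and {seconds:.1f} seconds")  # 3 minutes and 21.3 seconds
```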
Variables
date
Date
| Distinct | 1095 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 8.4 MiB |
| Minimum | 2021-01-01 00:00:00 |
| Maximum | 2023-12-31 00:00:00 |
| Invalid dates | 0 |
| Invalid dates (%) | 0.0% |
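The distinct count and the date range together suggest that every calendar day in the range is present. A quick check of that reading, using only the minimum, maximum, and distinct count reported above (variable names are illustrative):

```python
from datetime import date

# Minimum and maximum taken from the date summary.
first_day = date(2021, 1, 1)
last_day = date(2023, 12, 31)
distinct_dates = 1095

# Number of calendar days in the closed interval [first_day, last_day].
calendar_days = (last_day - first_day).days + 1
missing_days = calendar_days - distinct_dates

print(calendar_days, missing_days)  # 1095 0
```

Since 2021 through 2023 contains no leap year, three full years give exactly 1095 days, which matches the distinct count.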
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 2021 |
|---|---|
| 2nd row | 2021 |
| 3rd row | 2021 |
| 4th row | 2021 |
| 5th row | 2021 |
Common Values
| Value | Count | Frequency (%) |
| 2021 | 366825 | 33.3% |
| 2022 | 366715 | 33.3% |
| 2023 | 366460 | 33.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| 2 | 2566715 | |
| 0 | 1100000 | |
| 1 | 366825 | 8.3% |
| 3 | 366460 | 8.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4400000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 2 | 2566715 | |
| 0 | 1100000 | |
| 1 | 366825 | 8.3% |
| 3 | 366460 | 8.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4400000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 2 | 2566715 | |
| 0 | 1100000 | |
| 1 | 366825 | 8.3% |
| 3 | 366460 | 8.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4400000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 2 | 2566715 | |
| 0 | 1100000 | |
| 1 | 366825 | 8.3% |
| 3 | 366460 | 8.3% |
month
Real number (ℝ)
High correlation
| Distinct | 12 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.5256127 |
| Minimum | 1 |
| Maximum | 12 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 4 |
| median | 7 |
| Q3 | 10 |
| 95-th percentile | 12 |
| Maximum | 12 |
| Range | 11 |
| Interquartile range (IQR) | 6 |
Descriptive statistics
| Standard deviation | 3.4477598 |
|---|---|
| Coefficient of variation (CV) | 0.52834269 |
| Kurtosis | -1.2069738 |
| Mean | 6.5256127 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | -0.010304752 |
| Sum | 7178174 |
| Variance | 11.887048 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 1 | 93434 | 8.5% |
| 3 | 93434 | 8.5% |
| 7 | 93434 | 8.5% |
| 5 | 93434 | 8.5% |
| 8 | 93434 | 8.5% |
| 10 | 93403 | 8.5% |
| 12 | 93403 | 8.5% |
| 4 | 90420 | 8.2% |
| 6 | 90420 | 8.2% |
| 9 | 90402 | 8.2% |
| Other values (2) | 174782 | 15.9% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 93434 | 8.5% |
| 2 | 84392 | 7.7% |
| 3 | 93434 | 8.5% |
| 4 | 90420 | 8.2% |
| 5 | 93434 | 8.5% |
| 6 | 90420 | 8.2% |
| 7 | 93434 | 8.5% |
| 8 | 93434 | 8.5% |
| 9 | 90402 | 8.2% |
| 10 | 93403 | 8.5% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 12 | 93403 | 8.5% |
| 11 | 90390 | 8.2% |
| 10 | 93403 | 8.5% |
| 9 | 90402 | 8.2% |
| 8 | 93434 | 8.5% |
| 7 | 93434 | 8.5% |
| 6 | 90420 | 8.2% |
| 5 | 93434 | 8.5% |
| 4 | 90420 | 8.2% |
| 3 | 93434 | 8.5% |
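The value tables above list a count for every month, which is enough to recompute the summary statistics for this variable. The sketch below rebuilds the sum, mean, and quartiles from those counts. The `weighted_quantile` helper is a hypothetical name, and it uses a simple "first value whose cumulative count reaches the threshold" rule, which is an assumption and may differ from the interpolation the profiler applies:

```python
# Month -> count, combined from the value tables above.
month_counts = {
    1: 93434, 2: 84392, 3: 93434, 4: 90420, 5: 93434, 6: 90420,
    7: 93434, 8: 93434, 9: 90402, 10: 93403, 11: 90390, 12: 93403,
}

n = sum(month_counts.values())
total = sum(value * count for value, count in month_counts.items())
mean = total / n


def weighted_quantile(counts, q):
    """Smallest value whose cumulative count reaches the fraction q of all rows."""
    threshold = q * sum(counts.values())
    running = 0
    for value in sorted(counts):
        running += counts[value]
        if running >= threshold:
            return value
    raise ValueError("q must be between 0 and 1")


quartiles = [weighted_quantile(month_counts, q) for q in (0.25, 0.5, 0.75)]

print(n, total, round(mean, 7))  # 1100000 7178174 6.5256127
print(quartiles)                 # [4, 7, 10]
```

These values agree with the Sum, Mean, Q1, median, and Q3 entries reported for month.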
day
Real number (ℝ)
| Distinct | 31 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 15.720444 |
| Minimum | 1 |
| Maximum | 31 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 8 |
| median | 16 |
| Q3 | 23 |
| 95-th percentile | 29 |
| Maximum | 31 |
| Range | 30 |
| Interquartile range (IQR) | 15 |
Descriptive statistics
| Standard deviation | 8.7962619 |
|---|---|
| Coefficient of variation (CV) | 0.55954286 |
| Kurtosis | -1.1931587 |
| Mean | 15.720444 |
| Median Absolute Deviation (MAD) | 8 |
| Skewness | 0.0075380274 |
| Sum | 17292488 |
| Variance | 77.374224 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 1 | 36165 | 3.3% |
| 2 | 36165 | 3.3% |
| 3 | 36165 | 3.3% |
| 4 | 36165 | 3.3% |
| 5 | 36165 | 3.3% |
| 6 | 36165 | 3.3% |
| 7 | 36165 | 3.3% |
| 8 | 36165 | 3.3% |
| 9 | 36165 | 3.3% |
| 10 | 36165 | 3.3% |
| Other values (21) | 738350 | 67.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 36165 | 3.3% |
| 2 | 36165 | 3.3% |
| 3 | 36165 | 3.3% |
| 4 | 36165 | 3.3% |
| 5 | 36165 | 3.3% |
| 6 | 36165 | 3.3% |
| 7 | 36165 | 3.3% |
| 8 | 36165 | 3.3% |
| 9 | 36165 | 3.3% |
| 10 | 36165 | 3.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 31 | 21096 | 1.9% |
| 30 | 33150 | 3.0% |
| 29 | 33150 | 3.0% |
| 28 | 36164 | 3.3% |
| 27 | 36164 | 3.3% |
| 26 | 36164 | 3.3% |
| 25 | 36164 | 3.3% |
| 24 | 36164 | 3.3% |
| 23 | 36164 | 3.3% |
| 22 | 36164 | 3.3% |
weekofyear
Real number (ℝ)
High correlation
| Distinct | 53 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 26.570811 |
| Minimum | 1 |
| Maximum | 53 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 14 |
| median | 27 |
| Q3 | 40 |
| 95-th percentile | 50 |
| Maximum | 53 |
| Range | 52 |
| Interquartile range (IQR) | 26 |
Descriptive statistics
| Standard deviation | 15.051255 |
|---|---|
| Coefficient of variation (CV) | 0.56645826 |
| Kurtosis | -1.2001142 |
| Mean | 26.570811 |
| Median Absolute Deviation (MAD) | 13 |
| Skewness | 0.00063724589 |
| Sum | 29227892 |
| Variance | 226.54028 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 1 | 21098 | 1.9% |
| 3 | 21098 | 1.9% |
| 2 | 21098 | 1.9% |
| 4 | 21098 | 1.9% |
| 5 | 21098 | 1.9% |
| 13 | 21098 | 1.9% |
| 6 | 21098 | 1.9% |
| 7 | 21098 | 1.9% |
| 8 | 21098 | 1.9% |
| 9 | 21098 | 1.9% |
| Other values (43) | 889020 | 80.8% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 21098 | 1.9% |
| 2 | 21098 | 1.9% |
| 3 | 21098 | 1.9% |
| 4 | 21098 | 1.9% |
| 5 | 21098 | 1.9% |
| 6 | 21098 | 1.9% |
| 7 | 21098 | 1.9% |
| 8 | 21098 | 1.9% |
| 9 | 21098 | 1.9% |
| 10 | 21098 | 1.9% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 53 | 3015 | 0.3% |
| 52 | 21091 | 1.9% |
| 51 | 21091 | 1.9% |
| 50 | 21091 | 1.9% |
| 49 | 21091 | 1.9% |
| 48 | 21091 | 1.9% |
| 47 | 21091 | 1.9% |
| 46 | 21091 | 1.9% |
| 45 | 21091 | 1.9% |
| 44 | 21091 | 1.9% |
weekday
Real number (ℝ)
High correlation Zeros
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 3.0054791 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 156713 |
| Zeros (%) | 14.2% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 1 |
| median | 3 |
| Q3 | 5 |
| 95-th percentile | 6 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.0004513 |
|---|---|
| Coefficient of variation (CV) | 0.66560147 |
| Kurtosis | -1.250774 |
| Mean | 3.0054791 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | -0.004111291 |
| Sum | 3306027 |
| Variance | 4.0018054 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 4 | 157717 | 14.3% |
| 5 | 157717 | 14.3% |
| 6 | 157717 | 14.3% |
| 0 | 156713 | 14.2% |
| 1 | 156712 | 14.2% |
| 2 | 156712 | 14.2% |
| 3 | 156712 | 14.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 156713 | 14.2% |
| 1 | 156712 | 14.2% |
| 2 | 156712 | 14.2% |
| 3 | 156712 | 14.2% |
| 4 | 157717 | 14.3% |
| 5 | 157717 | 14.3% |
| 6 | 157717 | 14.3% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 6 | 157717 | 14.3% |
| 5 | 157717 | 14.3% |
| 4 | 157717 | 14.3% |
| 3 | 156712 | 14.2% |
| 2 | 156712 | 14.2% |
| 1 | 156712 | 14.2% |
| 0 | 156713 | 14.2% |
is_weekend
Categorical
High correlation
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 784566 | 71.3% |
| 1 | 315434 | 28.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 784566 | |
| 1 | 315434 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 784566 | |
| 1 | 315434 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 784566 | |
| 1 | 315434 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 784566 | |
| 1 | 315434 |
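The high-correlation alert between is_weekend and weekday is consistent with is_weekend being derived from weekday. The sketch below checks the counts under the assumption that weekday uses Monday = 0, so that 5 and 6 are Saturday and Sunday; the report does not state the encoding, and since values 4, 5, and 6 share the same count, the check supports this reading without proving it:

```python
# Counts copied from the weekday and is_weekend value tables.
weekday_counts = {0: 156713, 1: 156712, 2: 156712, 3: 156712,
                  4: 157717, 5: 157717, 6: 157717}
is_weekend_counts = {0: 784566, 1: 315434}

# Assumption: 5 and 6 encode Saturday and Sunday (Monday = 0).
weekend_rows = weekday_counts[5] + weekday_counts[6]
workday_rows = sum(count for day, count in weekday_counts.items() if day < 5)

print(weekend_rows == is_weekend_counts[1])  # True
print(workday_rows == is_weekend_counts[0])  # True
```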
is_holiday
Categorical
Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.5 MiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1084932 | 98.6% |
| 1 | 15068 | 1.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1084932 | |
| 1 | 15068 | 1.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1084932 | |
| 1 | 15068 | 1.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1084932 | |
| 1 | 15068 | 1.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1084932 | |
| 1 | 15068 | 1.4% |
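The 89.6% imbalance alert for is_holiday can be reproduced from the two class counts. The sketch below assumes the score is one minus the Shannon entropy of the class shares, normalized by the maximum entropy for that number of classes. The report does not state the formula, so this is an assumption that happens to match the reported figure, and `imbalance_score` is a hypothetical helper name:

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k), where H is the Shannon entropy of the class shares."""
    n = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0
    entropy = -sum((c / n) * math.log2(c / n) for c in counts if c > 0)
    return 1.0 - entropy / math.log2(k)


# Class counts for is_holiday (0 and 1) from the table above.
score = imbalance_score([1084932, 15068])
print(f"{score:.1%}")  # 89.6%
```

Under this definition a perfectly balanced column scores 0 and a column with a single dominant class approaches 1.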
temperature
Real number (ℝ)
| Distinct | 710 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 12.815005 |
| Minimum | 1.8 |
| Maximum | 22.83 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1.8 |
|---|---|
| 5-th percentile | 7.25 |
| Q1 | 10.61 |
| median | 12.84 |
| Q3 | 15 |
| 95-th percentile | 18.53 |
| Maximum | 22.83 |
| Range | 21.03 |
| Interquartile range (IQR) | 4.39 |
Descriptive statistics
| Standard deviation | 3.3715868 |
|---|---|
| Coefficient of variation (CV) | 0.26309679 |
| Kurtosis | -0.048365811 |
| Mean | 12.815005 |
| Median Absolute Deviation (MAD) | 2.21 |
| Skewness | -0.060420163 |
| Sum | 14096506 |
| Variance | 11.367598 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 12.31 | 5024 | 0.5% |
| 12.02 | 5024 | 0.5% |
| 14.41 | 5023 | 0.5% |
| 13.74 | 5022 | 0.5% |
| 11.01 | 4020 | 0.4% |
| 13.13 | 4020 | 0.4% |
| 14.81 | 4019 | 0.4% |
| 14.8 | 4019 | 0.4% |
| 13.48 | 4019 | 0.4% |
| 12.99 | 4019 | 0.4% |
| Other values (700) | 1055791 | 96.0% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1.8 | 1005 | 0.1% |
| 2.74 | 1005 | 0.1% |
| 3.19 | 1005 | 0.1% |
| 3.62 | 1004 | 0.1% |
| 3.88 | 1005 | 0.1% |
| 4.04 | 1004 | 0.1% |
| 4.13 | 1005 | 0.1% |
| 4.27 | 2009 | 0.2% |
| 4.56 | 1004 | 0.1% |
| 4.89 | 1004 | 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 22.83 | 1005 | 0.1% |
| 22.59 | 1004 | 0.1% |
| 22.3 | 1005 | 0.1% |
| 21.58 | 1005 | 0.1% |
| 21.54 | 1005 | 0.1% |
| 21.28 | 1004 | 0.1% |
| 21.26 | 1005 | 0.1% |
| 20.83 | 1004 | 0.1% |
| 20.6 | 1005 | 0.1% |
| 20.36 | 1004 | 0.1% |
rain_mm
Real number (ℝ)
| Distinct | 559 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 2.9041064 |
| Minimum | 0 |
| Maximum | 11.58 |
| Zeros | 1005 |
| Zeros (%) | 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0.23 |
| Q1 | 1.22 |
| median | 2.57 |
| Q3 | 4.15 |
| 95-th percentile | 6.85 |
| Maximum | 11.58 |
| Range | 11.58 |
| Interquartile range (IQR) | 2.93 |
Descriptive statistics
| Standard deviation | 2.0989968 |
|---|---|
| Coefficient of variation (CV) | 0.72276859 |
| Kurtosis | 0.7851252 |
| Mean | 2.9041064 |
| Median Absolute Deviation (MAD) | 1.46 |
| Skewness | 0.91618052 |
| Sum | 3194517 |
| Variance | 4.4057877 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 2.06 | 6029 | 0.5% |
| 0.69 | 6027 | 0.5% |
| 0.03 | 6027 | 0.5% |
| 0.89 | 6026 | 0.5% |
| 1.22 | 5025 | 0.5% |
| 0.71 | 5025 | 0.5% |
| 1.77 | 5024 | 0.5% |
| 1 | 5024 | 0.5% |
| 1.43 | 5024 | 0.5% |
| 2.51 | 5024 | 0.5% |
| Other values (549) | 1045745 | 95.1% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 1005 | 0.1% |
| 0.01 | 2010 | 0.2% |
| 0.02 | 1004 | 0.1% |
| 0.03 | 6027 | 0.5% |
| 0.04 | 3014 | 0.3% |
| 0.05 | 4018 | 0.4% |
| 0.06 | 2010 | 0.2% |
| 0.07 | 3013 | 0.3% |
| 0.08 | 4020 | 0.4% |
| 0.09 | 2008 | 0.2% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 11.58 | 1005 | 0.1% |
| 11.41 | 1005 | 0.1% |
| 11.33 | 1004 | 0.1% |
| 10.96 | 1004 | 0.1% |
| 10.85 | 1004 | 0.1% |
| 10.28 | 1005 | 0.1% |
| 9.93 | 1005 | 0.1% |
| 9.86 | 1004 | 0.1% |
| 9.77 | 1005 | 0.1% |
| 9.28 | 1004 | 0.1% |
store_id
Categorical
High correlation
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 60.8 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 9 |
| Min length | 9 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | STORE0001 |
|---|---|
| 2nd row | STORE0001 |
| 3rd row | STORE0001 |
| 4th row | STORE0001 |
| 5th row | STORE0001 |
Common Values
| Value | Count | Frequency (%) |
| STORE0001 | 87600 | 8.0% |
| STORE0002 | 87600 | 8.0% |
| STORE0003 | 87600 | 8.0% |
| STORE0004 | 87600 | 8.0% |
| STORE0005 | 87600 | 8.0% |
| STORE0006 | 87600 | 8.0% |
| STORE0007 | 87600 | 8.0% |
| STORE0008 | 87600 | 8.0% |
| STORE0009 | 87600 | 8.0% |
| STORE0010 | 87600 | 8.0% |
| Other values (3) | 224000 | 20.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 3076000 | |
| T | 1100000 | 11.1% |
| S | 1100000 | 11.1% |
| O | 1100000 | 11.1% |
| R | 1100000 | 11.1% |
| E | 1100000 | 11.1% |
| 1 | 486800 | 4.9% |
| 2 | 175200 | 1.8% |
| 3 | 136400 | 1.4% |
| 4 | 87600 | 0.9% |
| Other values (5) | 438000 | 4.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 5500000 | |
| Decimal Number | 4400000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 3076000 | |
| 1 | 486800 | 11.1% |
| 2 | 175200 | 4.0% |
| 3 | 136400 | 3.1% |
| 4 | 87600 | 2.0% |
| 5 | 87600 | 2.0% |
| 6 | 87600 | 2.0% |
| 7 | 87600 | 2.0% |
| 8 | 87600 | 2.0% |
| 9 | 87600 | 2.0% |
Uppercase Letter
| Value | Count | Frequency (%) |
| T | 1100000 | |
| S | 1100000 | |
| O | 1100000 | |
| R | 1100000 | |
| E | 1100000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 5500000 | |
| Common | 4400000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 3076000 | |
| 1 | 486800 | 11.1% |
| 2 | 175200 | 4.0% |
| 3 | 136400 | 3.1% |
| 4 | 87600 | 2.0% |
| 5 | 87600 | 2.0% |
| 6 | 87600 | 2.0% |
| 7 | 87600 | 2.0% |
| 8 | 87600 | 2.0% |
| 9 | 87600 | 2.0% |
Latin
| Value | Count | Frequency (%) |
| T | 1100000 | |
| S | 1100000 | |
| O | 1100000 | |
| R | 1100000 | |
| E | 1100000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9900000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 3076000 | |
| T | 1100000 | 11.1% |
| S | 1100000 | 11.1% |
| O | 1100000 | 11.1% |
| R | 1100000 | 11.1% |
| E | 1100000 | 11.1% |
| 1 | 486800 | 4.9% |
| 2 | 175200 | 1.8% |
| 3 | 136400 | 1.4% |
| 4 | 87600 | 0.9% |
| Other values (5) | 438000 | 4.4% |
country
Categorical
High correlation
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 57.6 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 5 |
| Mean length | 5.9032727 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Germany |
|---|---|
| 2nd row | Germany |
| 3rd row | Germany |
| 4th row | Germany |
| 5th row | Germany |
Common Values
| Value | Count | Frequency (%) |
| Italy | 350400 | 31.9% |
| Spain | 262800 | 23.9% |
| Germany | 175200 | 15.9% |
| Poland | 87600 | 8.0% |
| France | 87600 | 8.0% |
| Austria | 87600 | 8.0% |
| Netherlands | 48800 | 4.4% |
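The mean length reported for country follows directly from the value counts above: it is the count-weighted average of the name lengths. A short sketch of that calculation (variable names are illustrative):

```python
# Country -> row count, copied from the Common Values table.
country_counts = {
    "Italy": 350400, "Spain": 262800, "Germany": 175200, "Poland": 87600,
    "France": 87600, "Austria": 87600, "Netherlands": 48800,
}

n_rows = sum(country_counts.values())
# Total characters across all rows: name length times number of rows.
total_chars = sum(len(name) * count for name, count in country_counts.items())
mean_length = total_chars / n_rows

print(n_rows, total_chars, round(mean_length, 7))  # 1100000 6493600 5.9032727
```

The total of 6493600 characters also matches the character totals the report lists for this column.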
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1100000 | |
| n | 662000 | |
| y | 525600 | 8.1% |
| l | 486800 | 7.5% |
| t | 486800 | 7.5% |
| r | 399200 | 6.1% |
| e | 360400 | 5.6% |
| I | 350400 | 5.4% |
| i | 350400 | 5.4% |
| p | 262800 | 4.0% |
| Other values (13) | 1509200 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5393600 | |
| Uppercase Letter | 1100000 | 16.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1100000 | |
| n | 662000 | |
| y | 525600 | |
| l | 486800 | |
| t | 486800 | |
| r | 399200 | 7.4% |
| e | 360400 | 6.7% |
| i | 350400 | 6.5% |
| p | 262800 | 4.9% |
| m | 175200 | 3.2% |
| Other values (6) | 584400 |
Uppercase Letter
| Value | Count | Frequency (%) |
| I | 350400 | |
| S | 262800 | |
| G | 175200 | |
| P | 87600 | 8.0% |
| F | 87600 | 8.0% |
| A | 87600 | 8.0% |
| N | 48800 | 4.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6493600 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1100000 | |
| n | 662000 | |
| y | 525600 | 8.1% |
| l | 486800 | 7.5% |
| t | 486800 | 7.5% |
| r | 399200 | 6.1% |
| e | 360400 | 5.6% |
| I | 350400 | 5.4% |
| i | 350400 | 5.4% |
| p | 262800 | 4.0% |
| Other values (13) | 1509200 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6493600 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1100000 | |
| n | 662000 | |
| y | 525600 | 8.1% |
| l | 486800 | 7.5% |
| t | 486800 | 7.5% |
| r | 399200 | 6.1% |
| e | 360400 | 5.6% |
| I | 350400 | 5.4% |
| i | 350400 | 5.4% |
| p | 262800 | 4.0% |
| Other values (13) | 1509200 |
city
Categorical
High correlation
| Distinct | 9 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 57.8 MiB |
Length
| Max length | 9 |
|---|---|
| Median length | 6 |
| Mean length | 6.0534545 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Berlin |
|---|---|
| 2nd row | Berlin |
| 3rd row | Berlin |
| 4th row | Berlin |
| 5th row | Berlin |
Common Values
| Value | Count | Frequency (%) |
| Berlin | 175200 | 15.9% |
| Rome | 175200 | 15.9% |
| Milan | 175200 | 15.9% |
| Barcelona | 175200 | 15.9% |
| Warsaw | 87600 | 8.0% |
| Paris | 87600 | 8.0% |
| Vienna | 87600 | 8.0% |
| Madrid | 87600 | 8.0% |
| Amsterdam | 48800 | 4.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1012400 | |
| n | 700800 | |
| e | 662000 | |
| r | 662000 | |
| i | 613200 | |
| l | 525600 | |
| B | 350400 | 5.3% |
| o | 350400 | 5.3% |
| m | 272800 | 4.1% |
| M | 262800 | 3.9% |
| Other values (10) | 1246400 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 5558800 | |
| Uppercase Letter | 1100000 | 16.5% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1012400 | |
| n | 700800 | |
| e | 662000 | |
| r | 662000 | |
| i | 613200 | |
| l | 525600 | |
| o | 350400 | 6.3% |
| m | 272800 | 4.9% |
| d | 224000 | 4.0% |
| s | 224000 | 4.0% |
| Other values (3) | 311600 | 5.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 350400 | |
| M | 262800 | |
| R | 175200 | |
| W | 87600 | 8.0% |
| P | 87600 | 8.0% |
| V | 87600 | 8.0% |
| A | 48800 | 4.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6658800 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1012400 | |
| n | 700800 | |
| e | 662000 | |
| r | 662000 | |
| i | 613200 | |
| l | 525600 | |
| B | 350400 | 5.3% |
| o | 350400 | 5.3% |
| m | 272800 | 4.1% |
| M | 262800 | 3.9% |
| Other values (10) | 1246400 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6658800 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1012400 | |
| n | 700800 | |
| e | 662000 | |
| r | 662000 | |
| i | 613200 | |
| l | 525600 | |
| B | 350400 | 5.3% |
| o | 350400 | 5.3% |
| m | 272800 | 4.1% |
| M | 262800 | 3.9% |
| Other values (10) | 1246400 |
channel
Categorical
High correlation
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 62.7 MiB |
Length
| Max length | 11 |
|---|---|
| Median length | 11 |
| Mean length | 10.761091 |
| Min length | 10 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Hypermarket |
|---|---|
| 2nd row | Hypermarket |
| 3rd row | Hypermarket |
| 4th row | Hypermarket |
| 5th row | Hypermarket |
Common Values
| Value | Count | Frequency (%) |
| Hypermarket | 525600 | 47.8% |
| Supermarket | 262800 | 23.9% |
| E-commerce | 262800 | 23.9% |
| Convenience | 48800 | 4.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 2248800 | |
| r | 1839600 | |
| m | 1314000 | |
| k | 788400 | 6.7% |
| t | 788400 | 6.7% |
| p | 788400 | 6.7% |
| a | 788400 | 6.7% |
| c | 574400 | 4.9% |
| H | 525600 | 4.4% |
| y | 525600 | 4.4% |
| Other values (9) | 1655600 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10474400 | |
| Uppercase Letter | 1100000 | 9.3% |
| Dash Punctuation | 262800 | 2.2% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 2248800 | |
| r | 1839600 | |
| m | 1314000 | |
| k | 788400 | 7.5% |
| t | 788400 | 7.5% |
| p | 788400 | 7.5% |
| a | 788400 | 7.5% |
| c | 574400 | 5.5% |
| y | 525600 | 5.0% |
| o | 311600 | 3.0% |
| Other values (4) | 506800 | 4.8% |
Uppercase Letter
| Value | Count | Frequency (%) |
| H | 525600 | |
| S | 262800 | |
| E | 262800 | |
| C | 48800 | 4.4% |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 262800 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 11574400 | |
| Common | 262800 | 2.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 2248800 | |
| r | 1839600 | |
| m | 1314000 | |
| k | 788400 | 6.8% |
| t | 788400 | 6.8% |
| p | 788400 | 6.8% |
| a | 788400 | 6.8% |
| c | 574400 | 5.0% |
| H | 525600 | 4.5% |
| y | 525600 | 4.5% |
| Other values (8) | 1392800 |
Common
| Value | Count | Frequency (%) |
| - | 262800 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 11837200 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 2248800 | |
| r | 1839600 | |
| m | 1314000 | |
| k | 788400 | 6.7% |
| t | 788400 | 6.7% |
| p | 788400 | 6.7% |
| a | 788400 | 6.7% |
| c | 574400 | 4.9% |
| H | 525600 | 4.4% |
| y | 525600 | 4.4% |
| Other values (9) | 1655600 |
latitude
Real number (ℝ)
High correlation
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 46.305811 |
| Minimum | 40.41706 |
| Maximum | 52.52586 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 40.41706 |
|---|---|
| 5-th percentile | 40.41706 |
| Q1 | 41.90833 |
| median | 45.46266 |
| Q3 | 52.25287 |
| 95-th percentile | 52.52586 |
| Maximum | 52.52586 |
| Range | 12.1088 |
| Interquartile range (IQR) | 10.34454 |
Descriptive statistics
| Standard deviation | 4.6029269 |
|---|---|
| Coefficient of variation (CV) | 0.099402791 |
| Kurtosis | -1.555797 |
| Mean | 46.305811 |
| Median Absolute Deviation (MAD) | 4.06112 |
| Skewness | 0.18305392 |
| Sum | 50936392 |
| Variance | 21.186936 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 52.52586 | 87600 | 8.0% |
| 41.94012 | 87600 | 8.0% |
| 45.44037 | 87600 | 8.0% |
| 52.50051 | 87600 | 8.0% |
| 52.25287 | 87600 | 8.0% |
| 48.87587 | 87600 | 8.0% |
| 41.40154 | 87600 | 8.0% |
| 48.20329 | 87600 | 8.0% |
| 41.36731 | 87600 | 8.0% |
| 45.46266 | 87600 | 8.0% |
| Other values (3) | 224000 | 20.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 40.41706 | 87600 | 8.0% |
| 41.36731 | 87600 | 8.0% |
| 41.40154 | 87600 | 8.0% |
| 41.90833 | 87600 | 8.0% |
| 41.94012 | 87600 | 8.0% |
| 45.44037 | 87600 | 8.0% |
| 45.46266 | 87600 | 8.0% |
| 48.20329 | 87600 | 8.0% |
| 48.87587 | 87600 | 8.0% |
| 52.25287 | 87600 | 8.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 52.52586 | 87600 | 8.0% |
| 52.50051 | 87600 | 8.0% |
| 52.36231 | 48800 | 4.4% |
| 52.25287 | 87600 | 8.0% |
| 48.87587 | 87600 | 8.0% |
| 48.20329 | 87600 | 8.0% |
| 45.46266 | 87600 | 8.0% |
| 45.44037 | 87600 | 8.0% |
| 41.94012 | 87600 | 8.0% |
| 41.90833 | 87600 | 8.0% |
longitude
Real number (ℝ)
High correlation
| Distinct | 13 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.030856 |
| Minimum | -3.67473 |
| Maximum | 20.99579 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 87600 |
| Negative (%) | 8.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | -3.67473 |
|---|---|
| 5-th percentile | -3.67473 |
| Q1 | 2.36046 |
| median | 9.20313 |
| Q3 | 13.39071 |
| 95-th percentile | 20.99579 |
| Maximum | 20.99579 |
| Range | 24.67052 |
| Interquartile range (IQR) | 11.03025 |
Descriptive statistics
| Standard deviation | 6.7274441 |
|---|---|
| Coefficient of variation (CV) | 0.7449398 |
| Kurtosis | -0.78421852 |
| Mean | 9.030856 |
| Median Absolute Deviation (MAD) | 4.24463 |
| Skewness | -0.17507884 |
| Sum | 9933941.6 |
| Variance | 45.258504 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 13.39071 | 87600 | 8.0% |
| 12.50588 | 87600 | 8.0% |
| 9.20313 | 87600 | 8.0% |
| 13.42074 | 87600 | 8.0% |
| 20.99579 | 87600 | 8.0% |
| 2.36046 | 87600 | 8.0% |
| 2.21134 | 87600 | 8.0% |
| 16.35873 | 87600 | 8.0% |
| 2.15708 | 87600 | 8.0% |
| 9.19682 | 87600 | 8.0% |
| Other values (3) | 224000 | 20.4% |
Minimum 10 values
| Value | Count | Frequency (%) |
| -3.67473 | 87600 | 8.0% |
| 2.15708 | 87600 | 8.0% |
| 2.21134 | 87600 | 8.0% |
| 2.36046 | 87600 | 8.0% |
| 4.9585 | 48800 | 4.4% |
| 9.19682 | 87600 | 8.0% |
| 9.20313 | 87600 | 8.0% |
| 12.50588 | 87600 | 8.0% |
| 12.51294 | 87600 | 8.0% |
| 13.39071 | 87600 | 8.0% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 20.99579 | 87600 | 8.0% |
| 16.35873 | 87600 | 8.0% |
| 13.42074 | 87600 | 8.0% |
| 13.39071 | 87600 | 8.0% |
| 12.51294 | 87600 | 8.0% |
| 12.50588 | 87600 | 8.0% |
| 9.20313 | 87600 | 8.0% |
| 9.19682 | 87600 | 8.0% |
| 4.9585 | 48800 | 4.4% |
| 2.36046 | 87600 | 8.0% |
sku_id
Text
| Distinct | 102 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 58.7 MiB |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | SKU0086 |
|---|---|
| 2nd row | SKU0086 |
| 3rd row | SKU0086 |
| 4th row | SKU0086 |
| 5th row | SKU0086 |
| Value | Count | Frequency (%) |
| sku0020 | 14235 | 1.3% |
| sku0001 | 14235 | 1.3% |
| sku0096 | 14235 | 1.3% |
| sku0039 | 14235 | 1.3% |
| sku0015 | 13140 | 1.2% |
| sku0006 | 13140 | 1.2% |
| sku0056 | 13140 | 1.2% |
| sku0028 | 13140 | 1.2% |
| sku0097 | 13140 | 1.2% |
| sku0087 | 13140 | 1.2% |
| Other values (92) | 964220 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2416810 | |
| S | 1100000 | |
| K | 1100000 | |
| U | 1100000 | |
| 1 | 256230 | 3.3% |
| 6 | 229950 | 3.0% |
| 8 | 224475 | 2.9% |
| 5 | 222285 | 2.9% |
| 2 | 214620 | 2.8% |
| 3 | 214145 | 2.8% |
| Other values (3) | 621485 | 8.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 4400000 | |
| Uppercase Letter | 3300000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2416810 | |
| 1 | 256230 | 5.8% |
| 6 | 229950 | 5.2% |
| 8 | 224475 | 5.1% |
| 5 | 222285 | 5.1% |
| 2 | 214620 | 4.9% |
| 3 | 214145 | 4.9% |
| 7 | 209765 | 4.8% |
| 9 | 208050 | 4.7% |
| 4 | 203670 | 4.6% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1100000 | |
| K | 1100000 | |
| U | 1100000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 4400000 | |
| Latin | 3300000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2416810 | |
| 1 | 256230 | 5.8% |
| 6 | 229950 | 5.2% |
| 8 | 224475 | 5.1% |
| 5 | 222285 | 5.1% |
| 2 | 214620 | 4.9% |
| 3 | 214145 | 4.9% |
| 7 | 209765 | 4.8% |
| 9 | 208050 | 4.7% |
| 4 | 203670 | 4.6% |
Latin
| Value | Count | Frequency (%) |
| S | 1100000 | |
| K | 1100000 | |
| U | 1100000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7700000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2416810 | |
| S | 1100000 | |
| K | 1100000 | |
| U | 1100000 | |
| 1 | 256230 | 3.3% |
| 6 | 229950 | 3.0% |
| 8 | 224475 | 2.9% |
| 5 | 222285 | 2.9% |
| 2 | 214620 | 2.8% |
| 3 | 214145 | 2.8% |
| Other values (3) | 621485 | 8.1% |
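The character tables for sku_id group each character by its Unicode general category. As a minimal sketch, assuming Python's standard-library `unicodedata` module and a hypothetical helper named `category_counts` (it is not part of the profiling tool), the split can be reproduced from one identifier in the Sample table:

```python
import unicodedata
from collections import Counter

# Readable names for the two general categories that occur in sku_id.
CATEGORY_NAMES = {"Nd": "Decimal Number", "Lu": "Uppercase Letter"}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


sample = ["SKU0086"]  # value copied from the Sample table
per_value = category_counts(sample)
print(dict(per_value))

# Every sku_id has length 7 (the letters S, K, U plus four digits), so
# scaling by the number of observations gives the per-category totals.
n_rows = 1_100_000
totals = {name: count * n_rows for name, count in per_value.items()}
print(totals)
```

Scaling the per-value counts by the 1,100,000 observations gives 4,400,000 decimal digits and 3,300,000 uppercase letters, which agrees with the "Most occurring categories" table.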
sku_name
Text
| Distinct | 102 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 65.7 MiB |
Length
| Max length | 19 |
|---|---|
| Median length | 16 |
| Mean length | 13.629259 |
| Min length | 11 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | BrandB Shampoo |
|---|---|
| 2nd row | BrandB Shampoo |
| 3rd row | BrandB Shampoo |
| 4th row | BrandB Shampoo |
| 5th row | BrandB Shampoo |
Words
| Value | Count | Frequency (%) |
| brandf | 187245 | 8.3% |
| brandb | 186150 | 8.2% |
| brandd | 186150 | 8.2% |
| branda | 182390 | 8.1% |
| brandc | 180675 | 8.0% |
| brande | 177390 | 7.8% |
| soda | 71175 | 3.1% |
| toothpaste | 70080 | 3.1% |
| shampoo | 70080 | 3.1% |
| cheese | 68985 | 3.0% |
| Other values (14) | 882095 |
Most occurring characters
| Value | Count | Frequency (%) |
| a | 1556615 | 10.4% |
| r | 1527670 | 10.2% |
| n | 1401745 | 9.3% |
| B | 1354040 | 9.0% |
| d | 1233590 | 8.2% |
| (space) | 1162415 | 7.8% |
| e | 949510 | 6.3% |
| o | 662000 | 4.4% |
| t | 642290 | 4.3% |
| C | 435810 | 2.9% |
| Other values (23) | 4066500 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 10529770 | |
| Uppercase Letter | 3300000 | 22.0% |
| Space Separator | 1162415 | 7.8% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| a | 1556615 | |
| r | 1527670 | |
| n | 1401745 | |
| d | 1233590 | |
| e | 949510 | |
| o | 662000 | |
| t | 642290 | |
| s | 404055 | 3.8% |
| i | 398580 | 3.8% |
| h | 340545 | 3.2% |
| Other values (9) | 1413170 |
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 1354040 | |
| C | 435810 | 13.2% |
| S | 263420 | 8.0% |
| D | 245280 | 7.4% |
| E | 239805 | 7.3% |
| F | 187245 | 5.7% |
| A | 182390 | 5.5% |
| T | 70080 | 2.1% |
| M | 67890 | 2.1% |
| W | 66795 | 2.0% |
| Other values (3) | 187245 | 5.7% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 1162415 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 13829770 | |
| Common | 1162415 | 7.8% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| a | 1556615 | |
| r | 1527670 | |
| n | 1401745 | 10.1% |
| B | 1354040 | 9.8% |
| d | 1233590 | 8.9% |
| e | 949510 | 6.9% |
| o | 662000 | 4.8% |
| t | 642290 | 4.6% |
| C | 435810 | 3.2% |
| s | 404055 | 2.9% |
| Other values (22) | 3662445 |
Common
| Value | Count | Frequency (%) |
| (space) | 1162415 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 14992185 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| a | 1556615 | 10.4% |
| r | 1527670 | 10.2% |
| n | 1401745 | 9.3% |
| B | 1354040 | 9.0% |
| d | 1233590 | 8.2% |
| (space) | 1162415 | 7.8% |
| e | 949510 | 6.3% |
| o | 662000 | 4.4% |
| t | 642290 | 4.3% |
| C | 435810 | 2.9% |
| Other values (23) | 4066500 |
category
Categorical
High correlation
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 60.1 MiB |
| Beverages | 266085 |
|---|---|
| Snacks | 261705 |
| Personal Care | 199290 |
| Dairy | 196005 |
| Home Care | 176915 |
Length
| Max length | 13 |
|---|---|
| Median length | 9 |
| Mean length | 8.2982045 |
| Min length | 5 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Personal Care |
|---|---|
| 2nd row | Personal Care |
| 3rd row | Personal Care |
| 4th row | Personal Care |
| 5th row | Personal Care |
Common Values
| Value | Count | Frequency (%) |
| Beverages | 266085 | 24.2% |
| Snacks | 261705 | 23.8% |
| Personal Care | 199290 | 18.1% |
| Dairy | 196005 | 17.8% |
| Home Care | 176915 | 16.1% |
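The Frequency (%) column is each count divided by the number of observations. A short Python sketch, using only the counts from the Common Values table for category (the variable names are illustrative), shows the arithmetic:

```python
# Counts copied from the Common Values table for `category`.
counts = {
    "Beverages": 266085,
    "Snacks": 261705,
    "Personal Care": 199290,
    "Dairy": 196005,
    "Home Care": 176915,
}

# With no missing values, the counts add up to the number of observations.
n_obs = sum(counts.values())

# Share of each category as a percentage, rounded to one decimal place.
shares = {name: round(100 * count / n_obs, 1) for name, count in counts.items()}

for name, pct in shares.items():
    print(f"{name:<14}{counts[name]:>8}{pct:>7.1f}%")
```

The counts sum to 1,100,000, so the five categories cover every row; Beverages and Snacks together account for roughly 48% of observations.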
[Plots: Length, Common Values]
Words
| Value | Count | Frequency (%) |
| care | 376205 | |
| beverages | 266085 | |
| snacks | 261705 | |
| personal | 199290 | |
| dairy | 196005 | |
| home | 176915 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1550665 | |
| a | 1299290 | |
| r | 1037585 | |
| s | 727080 | 8.0% |
| n | 460995 | 5.1% |
| C | 376205 | 4.1% |
| (space) | 376205 | 4.1% |
| o | 376205 | 4.1% |
| B | 266085 | 2.9% |
| v | 266085 | 2.9% |
| Other values (11) | 2391625 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 7275615 | |
| Uppercase Letter | 1476205 | 16.2% |
| Space Separator | 376205 | 4.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1550665 | |
| a | 1299290 | |
| r | 1037585 | |
| s | 727080 | |
| n | 460995 | 6.3% |
| o | 376205 | 5.2% |
| v | 266085 | 3.7% |
| g | 266085 | 3.7% |
| c | 261705 | 3.6% |
| k | 261705 | 3.6% |
| Other values (4) | 768215 |
Uppercase Letter
| Value | Count | Frequency (%) |
| C | 376205 | |
| B | 266085 | |
| S | 261705 | |
| P | 199290 | |
| D | 196005 | |
| H | 176915 |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 376205 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 8751820 | |
| Common | 376205 | 4.1% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1550665 | |
| a | 1299290 | |
| r | 1037585 | |
| s | 727080 | 8.3% |
| n | 460995 | 5.3% |
| C | 376205 | 4.3% |
| o | 376205 | 4.3% |
| B | 266085 | 3.0% |
| v | 266085 | 3.0% |
| g | 266085 | 3.0% |
| Other values (10) | 2125540 |
Common
| Value | Count | Frequency (%) |
| (space) | 376205 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9128025 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1550665 | |
| a | 1299290 | |
| r | 1037585 | |
| s | 727080 | 8.0% |
| n | 460995 | 5.1% |
| C | 376205 | 4.1% |
| (space) | 376205 | 4.1% |
| o | 376205 | 4.1% |
| B | 266085 | 2.9% |
| v | 266085 | 2.9% |
| Other values (11) | 2391625 |
subcategory
Categorical
High correlation
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 58.4 MiB |
| Soda | 71175 |
|---|---|
| Shampoo | 70080 |
| Toothpaste | 70080 |
| Cheese | 68985 |
| Milk | 67890 |
| Other values (12) | 751790 |
Length
| Max length | 12 |
|---|---|
| Median length | 9 |
| Mean length | 6.6292591 |
| Min length | 4 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Shampoo |
|---|---|
| 2nd row | Shampoo |
| 3rd row | Shampoo |
| 4th row | Shampoo |
| 5th row | Shampoo |
Common Values
| Value | Count | Frequency (%) |
| Soda | 71175 | 6.5% |
| Shampoo | 70080 | 6.4% |
| Toothpaste | 70080 | 6.4% |
| Cheese | 68985 | 6.3% |
| Milk | 67890 | 6.2% |
| Biscuits | 67890 | 6.2% |
| Chips | 66795 | 6.1% |
| Water | 66795 | 6.1% |
| Juice | 65700 | 6.0% |
| Chocolate | 64605 | 5.9% |
| Other values (7) | 420005 |
[Plot: Length]
Words
| Value | Count | Frequency (%) |
| soda | 71175 | 6.1% |
| shampoo | 70080 | 6.0% |
| toothpaste | 70080 | 6.0% |
| cheese | 68985 | 5.9% |
| milk | 67890 | 5.8% |
| biscuits | 67890 | 5.8% |
| chips | 66795 | 5.7% |
| water | 66795 | 5.7% |
| juice | 65700 | 5.7% |
| chocolate | 64605 | 5.6% |
| Other values (8) | 482420 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 949510 | 13.0% |
| o | 662000 | 9.1% |
| t | 642290 | 8.8% |
| a | 456615 | 6.3% |
| r | 427670 | 5.9% |
| s | 404055 | 5.5% |
| i | 398580 | 5.5% |
| h | 340545 | 4.7% |
| n | 301745 | 4.1% |
| p | 266085 | 3.6% |
| Other values (21) | 2443090 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 6129770 | |
| Uppercase Letter | 1100000 | 15.1% |
| Space Separator | 62415 | 0.9% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 949510 | |
| o | 662000 | |
| t | 642290 | |
| a | 456615 | 7.4% |
| r | 427670 | 7.0% |
| s | 404055 | 6.6% |
| i | 398580 | 6.5% |
| h | 340545 | 5.6% |
| n | 301745 | 4.9% |
| p | 266085 | 4.3% |
| Other values (9) | 1280675 |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 263420 | |
| C | 255135 | |
| T | 70080 | 6.4% |
| M | 67890 | 6.2% |
| B | 67890 | 6.2% |
| W | 66795 | 6.1% |
| J | 65700 | 6.0% |
| N | 62415 | 5.7% |
| E | 62415 | 5.7% |
| D | 59130 | 5.4% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 62415 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 7229770 | |
| Common | 62415 | 0.9% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 949510 | 13.1% |
| o | 662000 | 9.2% |
| t | 642290 | 8.9% |
| a | 456615 | 6.3% |
| r | 427670 | 5.9% |
| s | 404055 | 5.6% |
| i | 398580 | 5.5% |
| h | 340545 | 4.7% |
| n | 301745 | 4.2% |
| p | 266085 | 3.7% |
| Other values (20) | 2380675 |
Common
| Value | Count | Frequency (%) |
| (space) | 62415 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 7292185 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 949510 | 13.0% |
| o | 662000 | 9.1% |
| t | 642290 | 8.8% |
| a | 456615 | 6.3% |
| r | 427670 | 5.9% |
| s | 404055 | 5.5% |
| i | 398580 | 5.5% |
| h | 340545 | 4.7% |
| n | 301745 | 4.1% |
| p | 266085 | 3.6% |
| Other values (21) | 2443090 |
brand
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 57.7 MiB |
| BrandF | 187245 |
|---|---|
| BrandB | 186150 |
| BrandD | 186150 |
| BrandA | 182390 |
| BrandC | 180675 |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | BrandB |
|---|---|
| 2nd row | BrandB |
| 3rd row | BrandB |
| 4th row | BrandB |
| 5th row | BrandB |
Common Values
| Value | Count | Frequency (%) |
| BrandF | 187245 | 17.0% |
| BrandB | 186150 | 16.9% |
| BrandD | 186150 | 16.9% |
| BrandA | 182390 | 16.6% |
| BrandC | 180675 | 16.4% |
| BrandE | 177390 | 16.1% |
[Plots: Length, Common Values]
Words
| Value | Count | Frequency (%) |
| brandf | 187245 | |
| brandb | 186150 | |
| brandd | 186150 | |
| branda | 182390 | |
| brandc | 180675 | |
| brande | 177390 |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 1286150 | |
| r | 1100000 | |
| a | 1100000 | |
| n | 1100000 | |
| d | 1100000 | |
| F | 187245 | 2.8% |
| D | 186150 | 2.8% |
| A | 182390 | 2.8% |
| C | 180675 | 2.7% |
| E | 177390 | 2.7% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 4400000 | |
| Uppercase Letter | 2200000 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| B | 1286150 | |
| F | 187245 | 8.5% |
| D | 186150 | 8.5% |
| A | 182390 | 8.3% |
| C | 180675 | 8.2% |
| E | 177390 | 8.1% |
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 1100000 | |
| a | 1100000 | |
| n | 1100000 | |
| d | 1100000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 6600000 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| B | 1286150 | |
| r | 1100000 | |
| a | 1100000 | |
| n | 1100000 | |
| d | 1100000 | |
| F | 187245 | 2.8% |
| D | 186150 | 2.8% |
| A | 182390 | 2.8% |
| C | 180675 | 2.7% |
| E | 177390 | 2.7% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 6600000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| B | 1286150 | |
| r | 1100000 | |
| a | 1100000 | |
| n | 1100000 | |
| d | 1100000 | |
| F | 187245 | 2.8% |
| D | 186150 | 2.8% |
| A | 182390 | 2.8% |
| C | 180675 | 2.7% |
| E | 177390 | 2.7% |
units_sold
Real number (ℝ)
High correlation
| Distinct | 516 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 59.196347 |
| Minimum | 0 |
|---|---|
| Maximum | 704 |
| Zeros | 3092 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 10 |
| Q1 | 25 |
| median | 49 |
| Q3 | 82 |
| 95-th percentile | 142 |
| Maximum | 704 |
| Range | 704 |
| Interquartile range (IQR) | 57 |
Descriptive statistics
| Standard deviation | 45.007217 |
|---|---|
| Coefficient of variation (CV) | 0.76030395 |
| Kurtosis | 5.3705509 |
| Mean | 59.196347 |
| Median Absolute Deviation (MAD) | 27 |
| Skewness | 1.6714468 |
| Sum | 65115982 |
| Variance | 2025.6496 |
| Monotonicity | Not monotonic |
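Several entries in the units_sold tables follow directly from others. The Python sketch below recomputes them from the reported values; it is a consistency check on the published figures, not a reproduction of the profiler's own code, and the variable names are illustrative.

```python
# Values copied from the units_sold tables.
minimum, maximum = 0, 704
q1, q3 = 25, 82
mean = 59.196347
std = 45.007217
n_obs = 1_100_000

value_range = maximum - minimum  # Range
iqr = q3 - q1                    # Interquartile range (IQR)
cv = std / mean                  # Coefficient of variation (CV)
variance = std ** 2              # Variance is the squared standard deviation
total = mean * n_obs             # Sum is the mean times the observation count

print(value_range, iqr)
print(round(cv, 4), round(variance, 2), round(total))
```

The recomputed range (704), IQR (57), CV (about 0.7603), variance (about 2025.65) and sum (65115982 after rounding) agree with the table. A mean of about 59 against a median of 49 is consistent with the positive skewness of 1.67.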
Common values
| Value | Count | Frequency (%) |
| 18 | 15187 | 1.4% |
| 17 | 14936 | 1.4% |
| 20 | 14911 | 1.4% |
| 19 | 14851 | 1.4% |
| 21 | 14810 | 1.3% |
| 23 | 14524 | 1.3% |
| 22 | 14477 | 1.3% |
| 16 | 14435 | 1.3% |
| 15 | 14417 | 1.3% |
| 24 | 14221 | 1.3% |
| Other values (506) | 953231 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 3092 | 0.3% |
| 1 | 2840 | 0.3% |
| 2 | 2878 | 0.3% |
| 3 | 3241 | 0.3% |
| 4 | 4012 | |
| 5 | 5097 | |
| 6 | 6190 | |
| 7 | 7693 | |
| 8 | 8912 | |
| 9 | 9843 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 704 | 1 | |
| 691 | 1 | |
| 680 | 1 | |
| 627 | 1 | |
| 626 | 1 | |
| 601 | 1 | |
| 582 | 1 | |
| 580 | 1 | |
| 577 | 1 | |
| 576 | 1 |
list_price
Real number (ℝ)
High correlation
| Distinct | 99 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.7121003 |
| Minimum | 1.08 |
|---|---|
| Maximum | 14.8 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1.08 |
|---|---|
| 5-th percentile | 1.44 |
| Q1 | 4.2 |
| median | 7.38 |
| Q3 | 11.65 |
| 95-th percentile | 14.15 |
| Maximum | 14.8 |
| Range | 13.72 |
| Interquartile range (IQR) | 7.45 |
Descriptive statistics
| Standard deviation | 4.2530229 |
|---|---|
| Coefficient of variation (CV) | 0.55147401 |
| Kurtosis | -1.3109158 |
| Mean | 7.7121003 |
| Median Absolute Deviation (MAD) | 3.78 |
| Skewness | 0.057888721 |
| Sum | 8483310.3 |
| Variance | 18.088203 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 5.26 | 22995 | 2.1% |
| 9.87 | 20805 | 1.9% |
| 13.42 | 20805 | 1.9% |
| 11.8 | 14235 | 1.3% |
| 10.58 | 14235 | 1.3% |
| 6.24 | 14235 | 1.3% |
| 2.3 | 14235 | 1.3% |
| 9.29 | 13140 | 1.2% |
| 8.17 | 13140 | 1.2% |
| 9.37 | 13140 | 1.2% |
| Other values (89) | 939035 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1.08 | 12045 | |
| 1.1 | 13140 | |
| 1.29 | 13140 | |
| 1.36 | 10950 | |
| 1.44 | 9855 | |
| 1.48 | 9855 | |
| 1.57 | 10950 | |
| 1.63 | 10950 | |
| 1.72 | 10950 | |
| 1.81 | 10950 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 14.8 | 9855 | |
| 14.57 | 10950 | |
| 14.52 | 12045 | |
| 14.47 | 9855 | |
| 14.2 | 10950 | |
| 14.15 | 10950 | |
| 14.11 | 10950 | |
| 14.02 | 8760 | |
| 13.95 | 9855 | |
| 13.72 | 12045 |
discount_pct
Categorical
High correlation Imbalance
| Distinct | 5 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 54.6 MiB |
| 0.0 | 1011743 |
|---|---|
| 0.15 | 22367 |
| 0.1 | 22274 |
| 0.2 | 22086 |
| 0.3 | 21530 |
Length
| Max length | 4 |
|---|---|
| Median length | 3 |
| Mean length | 3.0203336 |
| Min length | 3 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.1 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.3 |
| 4th row | 0.0 |
| 5th row | 0.2 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 1011743 | 92.0% |
| 0.15 | 22367 | 2.0% |
| 0.1 | 22274 | 2.0% |
| 0.2 | 22086 | 2.0% |
| 0.3 | 21530 | 2.0% |
[Plots: Length, Common Values]
Words
| Value | Count | Frequency (%) |
| 0.0 | 1011743 | |
| 0.15 | 22367 | 2.0% |
| 0.1 | 22274 | 2.0% |
| 0.2 | 22086 | 2.0% |
| 0.3 | 21530 | 2.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 2111743 | |
| . | 1100000 | |
| 1 | 44641 | 1.3% |
| 5 | 22367 | 0.7% |
| 2 | 22086 | 0.7% |
| 3 | 21530 | 0.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 2222367 | |
| Other Punctuation | 1100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 2111743 | |
| 1 | 44641 | 2.0% |
| 5 | 22367 | 1.0% |
| 2 | 22086 | 1.0% |
| 3 | 21530 | 1.0% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 1100000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3322367 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 2111743 | |
| . | 1100000 | |
| 1 | 44641 | 1.3% |
| 5 | 22367 | 0.7% |
| 2 | 22086 | 0.7% |
| 3 | 21530 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3322367 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 2111743 | |
| . | 1100000 | |
| 1 | 44641 | 1.3% |
| 5 | 22367 | 0.7% |
| 2 | 22086 | 0.7% |
| 3 | 21530 | 0.6% |
promo_flag
Categorical
High correlation Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.5 MiB |
| 0 | 1011743 |
|---|---|
| 1 | 88257 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 1 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1011743 | 92.0% |
| 1 | 88257 | 8.0% |
[Plots: Length, Common Values]
Words
| Value | Count | Frequency (%) |
| 0 | 1011743 | |
| 1 | 88257 | 8.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1011743 | |
| 1 | 88257 | 8.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1011743 | |
| 1 | 88257 | 8.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1011743 | |
| 1 | 88257 | 8.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1011743 | |
| 1 | 88257 | 8.0% |
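The counts for promo_flag and discount_pct line up exactly, which helps explain the 1.000 association reported between them in the Correlations table. The Python sketch below compares the marginal counts only; matching totals are consistent with promo_flag being 1 exactly when a discount is applied, but confirming that would need a row-level check on the data itself.

```python
# Counts copied from the Common Values tables of discount_pct and promo_flag.
discount_counts = {
    "0.0": 1011743,
    "0.15": 22367,
    "0.1": 22274,
    "0.2": 22086,
    "0.3": 21530,
}
promo_counts = {"0": 1011743, "1": 88257}

# Rows carrying any non-zero discount level.
discounted_rows = sum(
    count for level, count in discount_counts.items() if level != "0.0"
)

print("discounted rows:", discounted_rows)
print("matches promo_flag = 1:", discounted_rows == promo_counts["1"])
print("matches promo_flag = 0:", discount_counts["0.0"] == promo_counts["0"])
```

If the row-level relationship holds, one of the two columns is redundant for modelling, since promo_flag would add no information beyond discount_pct.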
gross_sales
Real number (ℝ)
High correlation
| Distinct | 15354 |
|---|---|
| Distinct (%) | 1.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 440.68061 |
| Minimum | 0 |
|---|---|
| Maximum | 6593.9 |
| Zeros | 3092 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 44.91 |
| Q1 | 132.44 |
| median | 282.88 |
| Q3 | 605.28 |
| 95-th percentile | 1344.25 |
| Maximum | 6593.9 |
| Range | 6593.9 |
| Interquartile range (IQR) | 472.84 |
Descriptive statistics
| Standard deviation | 441.80051 |
|---|---|
| Coefficient of variation (CV) | 1.0025413 |
| Kurtosis | 5.9618175 |
| Mean | 440.68061 |
| Median Absolute Deviation (MAD) | 186.17 |
| Skewness | 2.0096014 |
| Sum | 4.8474867 × 10⁸ |
| Variance | 195187.69 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0 | 3092 | 0.3% |
| 200.6 | 1384 | 0.1% |
| 83.92 | 1374 | 0.1% |
| 73.43 | 1350 | 0.1% |
| 62.94 | 1217 | 0.1% |
| 94.41 | 1203 | 0.1% |
| 67.2 | 1182 | 0.1% |
| 46.2 | 1158 | 0.1% |
| 102.19 | 1116 | 0.1% |
| 92.9 | 1098 | 0.1% |
| Other values (15344) | 1085826 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 3092 | |
| 1.08 | 16 | < 0.1% |
| 1.1 | 21 | < 0.1% |
| 1.29 | 9 | < 0.1% |
| 1.36 | 69 | < 0.1% |
| 1.44 | 12 | < 0.1% |
| 1.48 | 11 | < 0.1% |
| 1.57 | 14 | < 0.1% |
| 1.63 | 27 | < 0.1% |
| 1.72 | 29 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 6593.9 | 1 | |
| 5843.95 | 1 | |
| 5716.6 | 1 | |
| 5660 | 1 | |
| 5617.55 | 1 | |
| 5575.1 | 1 | |
| 5532.65 | 1 | |
| 5461.9 | 1 | |
| 5447.75 | 2 | |
| 5377 | 1 |
net_sales
Real number (ℝ)
High correlation
| Distinct | 29938 |
|---|---|
| Distinct (%) | 2.7% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 429.95132 |
| Minimum | 0 |
|---|---|
| Maximum | 5144.94 |
| Zeros | 3092 |
| Zeros (%) | 0.3% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 44.88 |
| Q1 | 130.7675 |
| median | 277.86 |
| Q3 | 593.46 |
| 95-th percentile | 1307.25 |
| Maximum | 5144.94 |
| Range | 5144.94 |
| Interquartile range (IQR) | 462.6925 |
Descriptive statistics
| Standard deviation | 422.4992 |
|---|---|
| Coefficient of variation (CV) | 0.98266753 |
| Kurtosis | 4.2226402 |
| Mean | 429.95132 |
| Median Absolute Deviation (MAD) | 181.72 |
| Skewness | 1.819917 |
| Sum | 4.7294645 × 10⁸ |
| Variance | 178505.57 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0 | 3092 | 0.3% |
| 200.6 | 1384 | 0.1% |
| 83.92 | 1307 | 0.1% |
| 73.43 | 1293 | 0.1% |
| 62.94 | 1174 | 0.1% |
| 67.2 | 1164 | 0.1% |
| 46.2 | 1150 | 0.1% |
| 94.41 | 1124 | 0.1% |
| 102.19 | 1116 | 0.1% |
| 92.9 | 1108 | 0.1% |
| Other values (29928) | 1086088 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 3092 | |
| 0.77 | 1 | < 0.1% |
| 0.88 | 1 | < 0.1% |
| 1.08 | 16 | < 0.1% |
| 1.1 | 19 | < 0.1% |
| 1.16 | 1 | < 0.1% |
| 1.29 | 8 | < 0.1% |
| 1.34 | 1 | < 0.1% |
| 1.36 | 69 | < 0.1% |
| 1.44 | 12 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 5144.94 | 1 | |
| 4811 | 1 | |
| 4630.59 | 1 | |
| 4615.73 | 1 | |
| 4584.6 | 1 | |
| 4495.46 | 1 | |
| 4494.04 | 1 | |
| 4486.68 | 1 | |
| 4471.4 | 1 | |
| 4460.08 | 1 |
stock_on_hand
Real number (ℝ)
| Distinct | 640 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 299.47566 |
| Minimum | 0 |
|---|---|
| Maximum | 698 |
| Zeros | 108 |
| Zeros (%) | < 0.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 168 |
| Q1 | 245 |
| median | 300 |
| Q3 | 354 |
| 95-th percentile | 431 |
| Maximum | 698 |
| Range | 698 |
| Interquartile range (IQR) | 109 |
Descriptive statistics
| Standard deviation | 80.072923 |
|---|---|
| Coefficient of variation (CV) | 0.26737706 |
| Kurtosis | -0.011136862 |
| Mean | 299.47566 |
| Median Absolute Deviation (MAD) | 54 |
| Skewness | -0.00089299311 |
| Sum | 3.2942323 × 10⁸ |
| Variance | 6411.673 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 306 | 5608 | 0.5% |
| 292 | 5562 | 0.5% |
| 310 | 5554 | 0.5% |
| 293 | 5546 | 0.5% |
| 305 | 5541 | 0.5% |
| 303 | 5532 | 0.5% |
| 311 | 5521 | 0.5% |
| 295 | 5519 | 0.5% |
| 304 | 5510 | 0.5% |
| 302 | 5499 | 0.5% |
| Other values (630) | 1044608 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0 | 108 | |
| 1 | 7 | < 0.1% |
| 2 | 4 | < 0.1% |
| 3 | 4 | < 0.1% |
| 4 | 5 | < 0.1% |
| 5 | 7 | < 0.1% |
| 6 | 6 | < 0.1% |
| 7 | 9 | < 0.1% |
| 8 | 7 | < 0.1% |
| 9 | 10 | < 0.1% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 698 | 1 | |
| 687 | 1 | |
| 672 | 1 | |
| 670 | 1 | |
| 664 | 1 | |
| 659 | 1 | |
| 649 | 1 | |
| 647 | 1 | |
| 642 | 1 | |
| 641 | 1 |
stock_out_flag
Categorical
Imbalance
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 52.5 MiB |
| 0 | 1066886 |
|---|---|
| 1 | 33114 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 0 |
| 4th row | 0 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 1066886 | 97.0% |
| 1 | 33114 | 3.0% |
[Plots: Length, Common Values]
Words
| Value | Count | Frequency (%) |
| 0 | 1066886 | |
| 1 | 33114 | 3.0% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1066886 | |
| 1 | 33114 | 3.0% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1100000 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1066886 | |
| 1 | 33114 | 3.0% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 1100000 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1066886 | |
| 1 | 33114 | 3.0% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 1100000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1066886 | |
| 1 | 33114 | 3.0% |
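The Imbalance alert on discount_pct, promo_flag and stock_out_flag reflects one value dominating the column. One common way to quantify this is a normalised Shannon-entropy score, 1 − H / log2(k) for k classes. The Python sketch below uses that score as an assumption; the exact formula and the threshold the profiling tool applies are not stated in this report, and `imbalance_score` is a hypothetical helper.

```python
import math


def imbalance_score(counts):
    """Return 1 - H / log2(k): 0 for a uniform column, close to 1 when one value dominates."""
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no balance to measure
    total = sum(counts)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)


# Class counts copied from the Common Values tables.
columns = {
    "discount_pct": [1011743, 22367, 22274, 22086, 21530],
    "promo_flag": [1011743, 88257],
    "stock_out_flag": [1066886, 33114],
    "brand": [187245, 186150, 186150, 182390, 180675, 177390],
}

scores = {name: imbalance_score(counts) for name, counts in columns.items()}
for name, score in scores.items():
    print(f"{name:<15}{score:.3f}")
```

Under this assumed score, the three flagged columns all come out above 0.5, while the near-uniform brand column scores close to 0, which matches which variables carry the Imbalance label.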
lead_time_days
Real number (ℝ)
| Distinct | 17 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 6.5004036 |
| Minimum | 1 |
|---|---|
| Maximum | 17 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 5 |
| median | 6 |
| Q3 | 8 |
| 95-th percentile | 10 |
| Maximum | 17 |
| Range | 16 |
| Interquartile range (IQR) | 3 |
Descriptive statistics
| Standard deviation | 2.0140646 |
|---|---|
| Coefficient of variation (CV) | 0.30983685 |
| Kurtosis | -0.064369839 |
| Mean | 6.5004036 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | 0.016007546 |
| Sum | 7150444 |
| Variance | 4.0564563 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 7 | 211021 | |
| 6 | 210825 | |
| 5 | 165291 | |
| 8 | 164959 | |
| 9 | 100678 | |
| 4 | 100437 | |
| 10 | 48325 | 4.4% |
| 3 | 48237 | 4.4% |
| 2 | 18332 | 1.7% |
| 11 | 18307 | 1.7% |
| Other values (7) | 13588 | 1.2% |
Minimum 10 values
| Value | Count | Frequency (%) |
| 1 | 6942 | 0.6% |
| 2 | 18332 | 1.7% |
| 3 | 48237 | 4.4% |
| 4 | 100437 | |
| 5 | 165291 | |
| 6 | 210825 | |
| 7 | 211021 | |
| 8 | 164959 | |
| 9 | 100678 | |
| 10 | 48325 | 4.4% |
Maximum 10 values
| Value | Count | Frequency (%) |
| 17 | 1 | < 0.1% |
| 16 | 3 | < 0.1% |
| 15 | 35 | < 0.1% |
| 14 | 195 | < 0.1% |
| 13 | 1162 | 0.1% |
| 12 | 5250 | 0.5% |
| 11 | 18307 | 1.7% |
| 10 | 48325 | 4.4% |
| 9 | 100678 | |
| 8 | 164959 |
supplier_id
Text
| Distinct | 60 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 55.6 MiB |
Length
| Max length | 4 |
|---|---|
| Median length | 4 |
| Mean length | 4 |
| Min length | 4 |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | S008 |
|---|---|
| 2nd row | S057 |
| 3rd row | S017 |
| 4th row | S012 |
| 5th row | S038 |
Words
| Value | Count | Frequency (%) |
| s037 | 18579 | 1.7% |
| s051 | 18572 | 1.7% |
| s041 | 18533 | 1.7% |
| s040 | 18529 | 1.7% |
| s020 | 18507 | 1.7% |
| s019 | 18502 | 1.7% |
| s007 | 18481 | 1.7% |
| s060 | 18474 | 1.7% |
| s028 | 18469 | 1.7% |
| s024 | 18469 | 1.7% |
| Other values (50) | 914885 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1375714 | |
| S | 1100000 | |
| 4 | 293847 | 6.7% |
| 1 | 293284 | 6.7% |
| 2 | 293150 | 6.7% |
| 5 | 292959 | 6.7% |
| 3 | 292837 | 6.7% |
| 6 | 128062 | 2.9% |
| 7 | 110414 | 2.5% |
| 9 | 110056 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 3300000 | |
| Uppercase Letter | 1100000 | 25.0% |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1375714 | |
| 4 | 293847 | 8.9% |
| 1 | 293284 | 8.9% |
| 2 | 293150 | 8.9% |
| 5 | 292959 | 8.9% |
| 3 | 292837 | 8.9% |
| 6 | 128062 | 3.9% |
| 7 | 110414 | 3.3% |
| 9 | 110056 | 3.3% |
| 8 | 109677 | 3.3% |
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 1100000 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 3300000 | |
| Latin | 1100000 | 25.0% |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1375714 | |
| 4 | 293847 | 8.9% |
| 1 | 293284 | 8.9% |
| 2 | 293150 | 8.9% |
| 5 | 292959 | 8.9% |
| 3 | 292837 | 8.9% |
| 6 | 128062 | 3.9% |
| 7 | 110414 | 3.3% |
| 9 | 110056 | 3.3% |
| 8 | 109677 | 3.3% |
Latin
| Value | Count | Frequency (%) |
| S | 1100000 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 4400000 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1375714 | |
| S | 1100000 | |
| 4 | 293847 | 6.7% |
| 1 | 293284 | 6.7% |
| 2 | 293150 | 6.7% |
| 5 | 292959 | 6.7% |
| 3 | 292837 | 6.7% |
| 6 | 128062 | 2.9% |
| 7 | 110414 | 2.5% |
| 9 | 110056 | 2.5% |
purchase_cost
Real number (ℝ)
High correlation
| Distinct | 1062 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.6255461 |
| Minimum | 0.49 |
|---|---|
| Maximum | 11.1 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | 0.49 |
|---|---|
| 5-th percentile | 0.84 |
| Q1 | 2.4 |
| median | 4.35 |
| Q3 | 6.77 |
| 95-th percentile | 9.18 |
| Maximum | 11.1 |
| Range | 10.61 |
| Interquartile range (IQR) | 4.37 |
Descriptive statistics
| Standard deviation | 2.6626044 |
|---|---|
| Coefficient of variation (CV) | 0.57563028 |
| Kurtosis | -1.0081105 |
| Mean | 4.6255461 |
| Median Absolute Deviation (MAD) | 2.21 |
| Skewness | 0.25392754 |
| Sum | 5088100.7 |
| Variance | 7.0894621 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0.78 | 2470 | 0.2% |
| 0.8 | 2440 | 0.2% |
| 0.79 | 2439 | 0.2% |
| 0.82 | 2362 | 0.2% |
| 0.77 | 2337 | 0.2% |
| 0.81 | 2335 | 0.2% |
| 0.95 | 2324 | 0.2% |
| 0.96 | 2301 | 0.2% |
| 0.76 | 2296 | 0.2% |
| 0.94 | 2289 | 0.2% |
| Other values (1052) | 1076407 |
Minimum 10 values
| Value | Count | Frequency (%) |
| 0.49 | 302 | < 0.1% |
| 0.5 | 772 | |
| 0.51 | 782 | |
| 0.52 | 754 | |
| 0.53 | 806 | |
| 0.54 | 796 | |
| 0.55 | 814 | |
| 0.56 | 776 | |
| 0.57 | 752 | |
| 0.58 | 893 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 11.1 | 14 | |
| 11.09 | 23 | |
| 11.08 | 23 | |
| 11.07 | 26 | |
| 11.06 | 22 | |
| 11.05 | 18 | |
| 11.04 | 26 | |
| 11.03 | 21 | |
| 11.02 | 26 | |
| 11.01 | 27 |
margin_pct
Real number (ℝ)
High correlation
| Distinct | 601 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.38524047 |
| Minimum | -0.05 |
|---|---|
| Maximum | 0.55 |
| Zeros | 75 |
| Zeros (%) | < 0.1% |
| Negative | 3570 |
| Negative (%) | 0.3% |
| Memory size | 8.4 MiB |
Quantile statistics
| Minimum | -0.05 |
|---|---|
| 5-th percentile | 0.25 |
| Q1 | 0.312 |
| median | 0.389 |
| Q3 | 0.469 |
| 95-th percentile | 0.534 |
| Maximum | 0.55 |
| Range | 0.6 |
| Interquartile range (IQR) | 0.157 |
Descriptive statistics
| Standard deviation | 0.10245806 |
|---|---|
| Coefficient of variation (CV) | 0.26595872 |
| Kurtosis | 0.61135897 |
| Mean | 0.38524047 |
| Median Absolute Deviation (MAD) | 0.079 |
| Skewness | -0.59936561 |
| Sum | 423764.51 |
| Variance | 0.010497654 |
| Monotonicity | Not monotonic |
Common values
| Value | Count | Frequency (%) |
| 0.313 | 3719 | 0.3% |
| 0.343 | 3705 | 0.3% |
| 0.255 | 3705 | 0.3% |
| 0.261 | 3703 | 0.3% |
| 0.347 | 3696 | 0.3% |
| 0.297 | 3682 | 0.3% |
| 0.283 | 3678 | 0.3% |
| 0.336 | 3674 | 0.3% |
| 0.344 | 3672 | 0.3% |
| 0.33 | 3669 | 0.3% |
| Other values (591) | 1063097 |
Minimum 10 values
| Value | Count | Frequency (%) |
| -0.05 | 32 | < 0.1% |
| -0.049 | 71 | |
| -0.048 | 57 | |
| -0.047 | 72 | |
| -0.046 | 62 | |
| -0.045 | 75 | |
| -0.044 | 84 | |
| -0.043 | 83 | |
| -0.042 | 84 | |
| -0.041 | 75 |
Maximum 10 values
| Value | Count | Frequency (%) |
| 0.55 | 1678 | |
| 0.549 | 3336 | |
| 0.548 | 3373 | |
| 0.547 | 3409 | |
| 0.546 | 3415 | |
| 0.545 | 3292 | |
| 0.544 | 3309 | |
| 0.543 | 3386 | |
| 0.542 | 3394 | |
| 0.541 | 3397 |
Interactions
Correlations
| | brand | category | channel | city | country | day | discount_pct | gross_sales | is_holiday | is_weekend | latitude | lead_time_days | list_price | longitude | margin_pct | month | net_sales | promo_flag | purchase_cost | rain_mm | stock_on_hand | stock_out_flag | store_id | subcategory | temperature | units_sold | weekday | weekofyear | year |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| brand | 1.000 | 0.027 | 0.045 | 0.051 | 0.044 | 0.000 | 0.034 | 0.071 | 0.000 | 0.000 | 0.038 | 0.000 | 0.311 | 0.046 | 0.023 | 0.000 | 0.074 | 0.067 | 0.154 | 0.000 | 0.000 | 0.003 | 0.060 | 0.058 | 0.000 | 0.064 | 0.000 | 0.000 | 0.000 |
| category | 0.027 | 1.000 | 0.030 | 0.044 | 0.040 | 0.000 | 0.028 | 0.072 | 0.000 | 0.000 | 0.029 | 0.000 | 0.307 | 0.039 | 0.022 | 0.000 | 0.083 | 0.056 | 0.141 | 0.000 | 0.001 | 0.000 | 0.053 | 1.000 | 0.000 | 0.074 | 0.000 | 0.000 | 0.000 |
| channel | 0.045 | 0.030 | 1.000 | 0.833 | 0.785 | 0.000 | 0.004 | 0.083 | 0.000 | 0.000 | 0.592 | 0.001 | 0.051 | 0.761 | 0.003 | 0.000 | 0.088 | 0.008 | 0.034 | 0.000 | 0.000 | 0.000 | 1.000 | 0.071 | 0.000 | 0.114 | 0.000 | 0.000 | 0.000 |
| city | 0.051 | 0.044 | 0.833 | 1.000 | 1.000 | 0.000 | 0.005 | 0.045 | 0.000 | 0.000 | 1.000 | 0.000 | 0.047 | 1.000 | 0.003 | 0.000 | 0.048 | 0.011 | 0.028 | 0.000 | 0.001 | 0.001 | 1.000 | 0.067 | 0.000 | 0.058 | 0.000 | 0.000 | 0.000 |
| country | 0.044 | 0.040 | 0.785 | 1.000 | 1.000 | 0.000 | 0.005 | 0.050 | 0.000 | 0.000 | 0.866 | 0.000 | 0.048 | 0.853 | 0.003 | 0.000 | 0.053 | 0.011 | 0.031 | 0.000 | 0.002 | 0.000 | 1.000 | 0.072 | 0.000 | 0.064 | 0.000 | 0.000 | 0.000 |
| day | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 1.000 | 0.007 | 0.000 | 0.141 | 0.032 | -0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.012 | 0.000 | 0.004 | -0.000 | 0.027 | 0.002 | 0.000 | 0.000 | 0.000 | 0.452 | 0.000 | -0.002 | 0.065 | 0.000 |
| discount_pct | 0.034 | 0.028 | 0.004 | 0.005 | 0.005 | 0.007 | 1.000 | 0.090 | 0.006 | 0.006 | 0.004 | 0.000 | 0.069 | 0.005 | 0.478 | 0.009 | 0.045 | 1.000 | 0.037 | 0.007 | 0.001 | 0.002 | 0.008 | 0.062 | 0.009 | 0.168 | 0.006 | 0.009 | 0.006 |
| gross_sales | 0.071 | 0.072 | 0.083 | 0.045 | 0.050 | 0.000 | 0.090 | 1.000 | 0.008 | 0.075 | 0.044 | -0.001 | 0.628 | 0.071 | -0.053 | 0.013 | 0.998 | 0.173 | 0.607 | -0.001 | -0.000 | 0.093 | 0.049 | 0.123 | -0.004 | 0.691 | 0.059 | 0.014 | 0.000 |
| is_holiday | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.141 | 0.006 | 0.008 | 1.000 | 0.064 | 0.000 | 0.002 | 0.000 | 0.000 | 0.004 | 0.126 | 0.008 | 0.005 | 0.000 | 0.091 | 0.000 | 0.000 | 0.000 | 0.000 | 0.076 | 0.009 | 0.099 | 0.214 | 0.000 |
| is_weekend | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.032 | 0.006 | 0.075 | 0.064 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.038 | 0.082 | 0.000 | 0.000 | 0.110 | 0.000 | 0.000 | 0.000 | 0.000 | 0.060 | 0.104 | 1.000 | 0.006 | 0.003 |
| latitude | 0.038 | 0.029 | 0.592 | 1.000 | 0.866 | -0.000 | 0.004 | 0.044 | 0.000 | 0.000 | 1.000 | 0.001 | 0.009 | 0.683 | -0.003 | -0.000 | 0.044 | 0.008 | 0.007 | -0.000 | -0.000 | 0.001 | 1.000 | 0.059 | 0.000 | 0.049 | -0.000 | -0.000 | 0.000 |
| lead_time_days | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | -0.001 | 0.002 | 0.000 | 0.001 | 1.000 | -0.001 | -0.000 | -0.000 | -0.000 | -0.001 | 0.002 | -0.001 | 0.001 | -0.001 | 0.000 | 0.000 | 0.000 | 0.001 | 0.000 | -0.001 | -0.001 | 0.000 |
| list_price | 0.311 | 0.307 | 0.051 | 0.047 | 0.048 | 0.000 | 0.069 | 0.628 | 0.000 | 0.000 | 0.009 | -0.001 | 1.000 | 0.004 | 0.017 | 0.000 | 0.634 | 0.136 | 0.964 | 0.000 | 0.001 | 0.000 | 0.051 | 0.380 | -0.000 | -0.062 | 0.000 | 0.000 | 0.000 |
| longitude | 0.046 | 0.039 | 0.761 | 1.000 | 0.853 | 0.000 | 0.005 | 0.071 | 0.000 | 0.000 | 0.683 | -0.000 | 0.004 | 1.000 | -0.001 | 0.000 | 0.071 | 0.011 | 0.005 | 0.000 | 0.000 | 0.000 | 1.000 | 0.068 | -0.000 | 0.082 | 0.000 | 0.000 | 0.000 |
| margin_pct | 0.023 | 0.022 | 0.003 | 0.003 | 0.003 | 0.003 | 0.478 | -0.053 | 0.004 | 0.002 | -0.003 | -0.000 | 0.017 | -0.001 | 1.000 | 0.001 | -0.032 | 0.769 | -0.196 | 0.000 | -0.000 | 0.000 | 0.004 | 0.032 | 0.002 | -0.078 | -0.001 | 0.001 | 0.002 |
| month | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.012 | 0.009 | 0.013 | 0.126 | 0.038 | -0.000 | -0.000 | 0.000 | 0.000 | 0.001 | 1.000 | 0.014 | 0.011 | 0.000 | 0.043 | -0.001 | 0.003 | 0.000 | 0.000 | 0.000 | 0.018 | 0.002 | 0.966 | 0.000 |
| net_sales | 0.074 | 0.083 | 0.088 | 0.048 | 0.053 | 0.000 | 0.045 | 0.998 | 0.008 | 0.082 | 0.044 | -0.001 | 0.634 | 0.071 | -0.032 | 0.014 | 1.000 | 0.088 | 0.613 | -0.001 | -0.000 | 0.109 | 0.052 | 0.133 | -0.003 | 0.683 | 0.060 | 0.014 | 0.000 |
| promo_flag | 0.067 | 0.056 | 0.008 | 0.011 | 0.011 | 0.004 | 1.000 | 0.173 | 0.005 | 0.000 | 0.008 | 0.002 | 0.136 | 0.011 | 0.769 | 0.011 | 0.088 | 1.000 | 0.073 | 0.005 | 0.000 | 0.001 | 0.017 | 0.123 | 0.014 | 0.325 | 0.002 | 0.007 | 0.005 |
| purchase_cost | 0.154 | 0.141 | 0.034 | 0.028 | 0.031 | -0.000 | 0.037 | 0.607 | 0.000 | 0.000 | 0.007 | -0.001 | 0.964 | 0.005 | -0.196 | 0.000 | 0.613 | 0.073 | 1.000 | 0.000 | 0.001 | 0.000 | 0.032 | 0.189 | -0.000 | -0.068 | 0.000 | 0.000 | 0.000 |
| rain_mm | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.027 | 0.007 | -0.001 | 0.091 | 0.110 | -0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.043 | -0.001 | 0.005 | 0.000 | 1.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.006 | -0.001 | -0.011 | 0.030 | 0.050 |
| stock_on_hand | 0.000 | 0.001 | 0.000 | 0.001 | 0.002 | 0.002 | 0.001 | -0.000 | 0.000 | 0.000 | -0.000 | -0.001 | 0.001 | 0.000 | -0.000 | -0.001 | -0.000 | 0.000 | 0.001 | 0.000 | 1.000 | 0.000 | 0.001 | 0.000 | 0.001 | -0.001 | -0.000 | -0.001 | 0.000 |
| stock_out_flag | 0.003 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.002 | 0.093 | 0.000 | 0.000 | 0.001 | 0.000 | 0.000 | 0.000 | 0.000 | 0.003 | 0.109 | 0.001 | 0.000 | 0.000 | 0.000 | 1.000 | 0.000 | 0.001 | 0.000 | 0.121 | 0.002 | 0.002 | 0.000 |
| store_id | 0.060 | 0.053 | 1.000 | 1.000 | 1.000 | 0.000 | 0.008 | 0.049 | 0.000 | 0.000 | 1.000 | 0.000 | 0.051 | 1.000 | 0.004 | 0.000 | 0.052 | 0.017 | 0.032 | 0.000 | 0.001 | 0.000 | 1.000 | 0.066 | 0.000 | 0.066 | 0.000 | 0.000 | 0.000 |
| subcategory | 0.058 | 1.000 | 0.071 | 0.067 | 0.072 | 0.000 | 0.062 | 0.123 | 0.000 | 0.000 | 0.059 | 0.000 | 0.380 | 0.068 | 0.032 | 0.000 | 0.133 | 0.123 | 0.189 | 0.000 | 0.000 | 0.001 | 0.066 | 1.000 | 0.000 | 0.108 | 0.000 | 0.000 | 0.000 |
| temperature | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.452 | 0.009 | -0.004 | 0.076 | 0.060 | 0.000 | 0.001 | -0.000 | -0.000 | 0.002 | 0.000 | -0.003 | 0.014 | -0.000 | 0.006 | 0.001 | 0.000 | 0.000 | 0.000 | 1.000 | -0.005 | -0.041 | 0.034 | 0.080 |
| units_sold | 0.064 | 0.074 | 0.114 | 0.058 | 0.064 | 0.000 | 0.168 | 0.691 | 0.009 | 0.104 | 0.049 | 0.000 | -0.062 | 0.082 | -0.078 | 0.018 | 0.683 | 0.325 | -0.068 | -0.001 | -0.001 | 0.121 | 0.066 | 0.108 | -0.005 | 1.000 | 0.078 | 0.019 | 0.002 |
| weekday | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | -0.002 | 0.006 | 0.059 | 0.099 | 1.000 | -0.000 | -0.001 | 0.000 | 0.000 | -0.001 | 0.002 | 0.060 | 0.002 | 0.000 | -0.011 | -0.000 | 0.002 | 0.000 | 0.000 | -0.041 | 0.078 | 1.000 | 0.005 | 0.003 |
| weekofyear | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.065 | 0.009 | 0.014 | 0.214 | 0.006 | -0.000 | -0.001 | 0.000 | 0.000 | 0.001 | 0.966 | 0.014 | 0.007 | 0.000 | 0.030 | -0.001 | 0.002 | 0.000 | 0.000 | 0.034 | 0.019 | 0.005 | 1.000 | 0.000 |
| year | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.000 | 0.006 | 0.000 | 0.000 | 0.003 | 0.000 | 0.000 | 0.000 | 0.000 | 0.002 | 0.000 | 0.000 | 0.005 | 0.000 | 0.050 | 0.000 | 0.000 | 0.000 | 0.000 | 0.080 | 0.002 | 0.003 | 0.000 | 1.000 |
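The "High correlation" alerts in the overview can be related back to this matrix. The alerts list margin_pct against promo_flag (0.769) but not against discount_pct (0.478), which is consistent with a cut-off near 0.5 on the absolute coefficient. That threshold is an assumption here, not something the report states. A minimal sketch, using a handful of coefficients copied from the matrix and a hypothetical helper `high_pairs`:

```python
# A few off-diagonal coefficients copied from the matrix above.
corr = {
    ("gross_sales", "net_sales"): 0.998,
    ("month", "weekofyear"): 0.966,
    ("list_price", "purchase_cost"): 0.964,
    ("margin_pct", "promo_flag"): 0.769,
    ("margin_pct", "discount_pct"): 0.478,
    ("day", "temperature"): 0.452,
    ("margin_pct", "purchase_cost"): -0.196,
}

THRESHOLD = 0.5  # assumed cut-off; the report does not state the value it used


def high_pairs(coeffs, threshold=THRESHOLD):
    """Return (a, b, r) tuples with |r| >= threshold, strongest first."""
    hits = [(a, b, r) for (a, b), r in coeffs.items() if abs(r) >= threshold]
    return sorted(hits, key=lambda item: abs(item[2]), reverse=True)


for a, b, r in high_pairs(corr):
    print(f"{a} ~ {b}: {r:+.3f}")
```

Pairs such as day and temperature (0.452) fall below the assumed cut-off, which matches their absence from the alert list.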
Missing values
Sample
First rows
| | date | year | month | day | weekofyear | weekday | is_weekend | is_holiday | temperature | rain_mm | store_id | country | city | channel | latitude | longitude | sku_id | sku_name | category | subcategory | brand | units_sold | list_price | discount_pct | promo_flag | gross_sales | net_sales | stock_on_hand | stock_out_flag | lead_time_days | supplier_id | purchase_cost | margin_pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2021-01-01 | 2021 | 1 | 1 | 53 | 4 | 0 | 1 | 8.44 | 1.24 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 16 | 10.49 | 0.10 | 1 | 167.84 | 151.06 | 248 | 0 | 11 | S008 | 7.53 | 0.182 |
| 1 | 2021-01-02 | 2021 | 1 | 2 | 53 | 5 | 1 | 0 | 12.61 | 1.12 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 12 | 10.49 | 0.00 | 0 | 125.88 | 125.88 | 238 | 0 | 6 | S057 | 5.19 | 0.505 |
| 2 | 2021-01-03 | 2021 | 1 | 3 | 53 | 6 | 1 | 0 | 12.02 | 2.69 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 38 | 10.49 | 0.30 | 1 | 398.62 | 279.03 | 238 | 0 | 6 | S017 | 5.59 | 0.168 |
| 3 | 2021-01-04 | 2021 | 1 | 4 | 1 | 0 | 0 | 0 | 7.76 | 4.65 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 8 | 10.49 | 0.00 | 0 | 83.92 | 83.92 | 216 | 0 | 7 | S012 | 7.81 | 0.255 |
| 4 | 2021-01-05 | 2021 | 1 | 5 | 1 | 1 | 0 | 0 | 11.16 | 1.77 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 17 | 10.49 | 0.20 | 1 | 178.33 | 142.66 | 372 | 0 | 8 | S038 | 7.62 | 0.073 |
| 5 | 2021-01-06 | 2021 | 1 | 6 | 1 | 2 | 0 | 0 | 13.29 | 1.46 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 11 | 10.49 | 0.00 | 0 | 115.39 | 115.39 | 353 | 0 | 4 | S017 | 5.35 | 0.490 |
| 6 | 2021-01-07 | 2021 | 1 | 7 | 1 | 3 | 0 | 0 | 6.19 | 11.58 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 15 | 10.49 | 0.20 | 1 | 157.35 | 125.88 | 183 | 0 | 9 | S003 | 7.19 | 0.115 |
| 7 | 2021-01-08 | 2021 | 1 | 8 | 1 | 4 | 0 | 0 | 13.00 | 2.90 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 8 | 10.49 | 0.00 | 0 | 83.92 | 83.92 | 239 | 0 | 5 | S007 | 6.78 | 0.354 |
| 8 | 2021-01-09 | 2021 | 1 | 9 | 1 | 5 | 1 | 0 | 9.56 | 0.26 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 15 | 10.49 | 0.15 | 1 | 157.35 | 133.75 | 272 | 0 | 8 | S039 | 5.38 | 0.337 |
| 9 | 2021-01-10 | 2021 | 1 | 10 | 1 | 6 | 1 | 0 | 13.42 | 0.72 | STORE0001 | Germany | Berlin | Hypermarket | 52.52586 | 13.39071 | SKU0086 | BrandB Shampoo | Personal Care | Shampoo | BrandB | 17 | 10.49 | 0.10 | 1 | 178.33 | 160.50 | 250 | 0 | 7 | S028 | 6.99 | 0.234 |
Last rows
| | date | year | month | day | weekofyear | weekday | is_weekend | is_holiday | temperature | rain_mm | store_id | country | city | channel | latitude | longitude | sku_id | sku_name | category | subcategory | brand | units_sold | list_price | discount_pct | promo_flag | gross_sales | net_sales | stock_on_hand | stock_out_flag | lead_time_days | supplier_id | purchase_cost | margin_pct |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1099990 | 2022-09-03 | 2022 | 9 | 3 | 35 | 5 | 1 | 0 | 15.42 | 1.26 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 10 | 4.99 | 0.0 | 0 | 49.90 | 49.90 | 242 | 0 | 3 | S023 | 3.36 | 0.327 |
| 1099991 | 2022-09-04 | 2022 | 9 | 4 | 35 | 6 | 1 | 0 | 7.25 | 0.92 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 6 | 4.99 | 0.0 | 0 | 29.94 | 29.94 | 229 | 0 | 8 | S010 | 2.69 | 0.460 |
| 1099992 | 2022-09-05 | 2022 | 9 | 5 | 36 | 0 | 0 | 0 | 11.77 | 0.06 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 3 | 4.99 | 0.0 | 0 | 14.97 | 14.97 | 326 | 0 | 8 | S056 | 3.68 | 0.262 |
| 1099993 | 2022-09-06 | 2022 | 9 | 6 | 36 | 1 | 0 | 0 | 8.77 | 4.23 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 5 | 4.99 | 0.0 | 0 | 24.95 | 24.95 | 261 | 0 | 4 | S034 | 3.51 | 0.297 |
| 1099994 | 2022-09-07 | 2022 | 9 | 7 | 36 | 2 | 0 | 0 | 11.01 | 1.46 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 7 | 4.99 | 0.0 | 0 | 34.93 | 34.93 | 361 | 0 | 6 | S050 | 3.60 | 0.278 |
| 1099995 | 2022-09-08 | 2022 | 9 | 8 | 36 | 3 | 0 | 0 | 12.36 | 0.05 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 6 | 4.99 | 0.0 | 0 | 29.94 | 29.94 | 233 | 0 | 1 | S023 | 3.07 | 0.384 |
| 1099996 | 2022-09-09 | 2022 | 9 | 9 | 36 | 4 | 0 | 0 | 12.51 | 5.96 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 7 | 4.99 | 0.0 | 0 | 34.93 | 34.93 | 219 | 0 | 7 | S028 | 2.53 | 0.493 |
| 1099997 | 2022-09-10 | 2022 | 9 | 10 | 36 | 5 | 1 | 0 | 12.98 | 6.26 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 9 | 4.99 | 0.0 | 0 | 44.91 | 44.91 | 305 | 0 | 10 | S027 | 2.46 | 0.507 |
| 1099998 | 2022-09-11 | 2022 | 9 | 11 | 36 | 6 | 1 | 0 | 16.72 | 0.20 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 5 | 4.99 | 0.0 | 0 | 24.95 | 24.95 | 282 | 0 | 9 | S058 | 2.92 | 0.414 |
| 1099999 | 2022-09-12 | 2022 | 9 | 12 | 37 | 0 | 0 | 0 | 13.41 | 3.60 | STORE0013 | Netherlands | Amsterdam | Convenience | 52.36231 | 4.9585 | SKU0073 | BrandA Softener | Home Care | Softener | BrandA | 6 | 4.99 | 0.0 | 0 | 29.94 | 29.94 | 206 | 0 | 3 | S051 | 2.87 | 0.425 |
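The sample rows appear consistent with gross_sales = units_sold × list_price and net_sales = gross_sales × (1 − discount_pct), which would also explain the near-perfect gross_sales/net_sales correlation. This relation is inferred from the rows shown, not documented in the report. A minimal sketch that checks it on a few of the sample rows, with a hypothetical helper `check_row` and a one-cent tolerance for rounding:

```python
# (units_sold, list_price, discount_pct, gross_sales, net_sales)
# copied from sample rows 0, 2, 8 and 1099990 above.
rows = [
    (16, 10.49, 0.10, 167.84, 151.06),
    (38, 10.49, 0.30, 398.62, 279.03),
    (15, 10.49, 0.15, 157.35, 133.75),
    (10, 4.99, 0.00, 49.90, 49.90),
]


def check_row(units, price, discount, gross, net, tol=0.01):
    """True if the row matches the assumed sales relations within tol."""
    gross_ok = abs(units * price - gross) <= tol
    net_ok = abs(gross * (1 - discount) - net) <= tol
    return gross_ok and net_ok


for row in rows:
    print(row, check_row(*row))
```

Running the same check over the full dataset, rather than four rows, would be needed before relying on the relation.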